Getting BART to Ride the Idiomatic Train: Learning to Represent Idiomatic Expressions

Abstract

Idiomatic expressions (IEs), characterized by their non-compositionality, are an important part of natural language. They have been a classical challenge to NLP, including for the pre-trained language models that drive today’s state of the art. Prior work has identified deficiencies in their contextualized representation stemming from the underlying compositional paradigm of representation. In this work, we take a first-principles approach to build idiomaticity into BART, using an adapter as a lightweight non-compositional expert trained on idiomatic sentences. The improved capability over baselines (e.g., BART) is seen via intrinsic and extrinsic methods, where idiom embeddings score 0.19 points higher in homogeneity for embedding clustering, and up to 25% higher in sequence accuracy on the processing tasks of IE sense disambiguation and span detection.
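
The abstract reports a homogeneity score for embedding clustering but this listing does not say how it was computed. As a rough illustration only, the sketch below uses the standard definition of homogeneity, 1 - H(C|K) / H(C), where C are the gold classes and K the predicted clusters. The function name `homogeneity` and the toy idiom labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def homogeneity(labels_true, labels_pred):
    """Homogeneity = 1 - H(C|K) / H(C) for gold classes C and clusters K.

    Illustrative sketch; not the authors' evaluation code.
    """
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("label lists must be non-empty and of equal length")

    class_counts = Counter(labels_true)
    cluster_counts = Counter(labels_pred)
    joint_counts = Counter(zip(labels_true, labels_pred))

    # Entropy of the gold classes, H(C).
    h_c = -sum((c / n) * math.log(c / n) for c in class_counts.values())
    if h_c == 0.0:
        return 1.0  # a single gold class is trivially homogeneous

    # Conditional entropy of classes given clusters, H(C|K).
    h_c_given_k = -sum(
        (n_ck / n) * math.log(n_ck / cluster_counts[k])
        for (_, k), n_ck in joint_counts.items()
    )
    return 1.0 - h_c_given_k / h_c


# Hypothetical toy data: each embedding is labeled with the idiom it came from.
gold = ["kick the bucket", "kick the bucket", "spill the beans", "spill the beans"]
pure_clusters = [0, 0, 1, 1]    # every cluster holds a single idiom
merged_clusters = [0, 0, 0, 0]  # one cluster mixes both idioms

score_pure = homogeneity(gold, pure_clusters)
score_merged = homogeneity(gold, merged_clusters)
```

A score near 1 means each cluster contains embeddings of a single idiom, and a score near 0 means the clusters carry no information about idiom identity. On this reading, the reported gain of 0.19 is a difference between two such scores.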


Similar resources

Idiomatic (gene) expressions.

Hidden among the myriad nucleotide variants that constitute each species' gene pool are a few variants that contribute to phenotypic variation. Many of these differences that make a difference are non-coding cis-regulatory variants, which, unlike coding variants, can only be identified through laborious experimental analysis. Recently, Cowles et al.1 described a screening method that does an en...


Idiomatic Expressions in VerbaLex

Idiomatic expressions are part of everyday language; therefore, NLP applications that can “understand” idioms are desirable. The nature of idioms is somewhat heterogeneous: idioms form classes differing in many aspects (e.g. syntactic structure, lexical and syntactic fixedness). Although dictionaries of idioms exist, they usually do not contain information about fixedness or frequency since they...


Phrasal Substitution of Idiomatic Expressions

Idioms pose a great challenge to natural language understanding. A system that can automatically paraphrase idioms in context has applications in many NLP tasks. This paper proposes a phrasal substitution method to replace idioms with their figurative meanings in literal English. Our approach identifies relevant replacement phrases from an idiom’s dictionary definition and performs appropriate ...


Normative data for idiomatic expressions

Idiomatic expressions such as kick the bucket or go down a storm can differ on a number of internal features, such as familiarity, meaning, literality, and decomposability, and these types of features have been the focus of a number of normative studies. In this article, we provide normative data for a set of Bulgarian idioms and their English translations, and by doing so replicate in a Slavic...


The Generation of Idiomatic and Collocational Expressions

Collocations whose semantic content is not or only partially composed from the semantic content of their parts are often viewed as problematic for generation. In this paper, a tactical generator combining FUF as the generation engine and HPSG as the grammar framework is presented. It is shown that the lexicon-driven approach to syntactic and semantic processing is well-suited for the generation...



Journal

Journal title: Transactions of the Association for Computational Linguistics

Year: 2022

ISSN: 2307-387X

DOI: https://doi.org/10.1162/tacl_a_00510